Every election is supervised by the Central Election Commission (PKW), which is independent of the government and composed mainly of judges.
41 multi-member constituencies (7–19 seats each), 460 seats in total. With a population of about 38 million, that is roughly 80 thousand residents (not voters) per seat.
Proportional voting system: all 460 members are elected by proportional representation, with seats distributed using the D'Hondt method.
Parties win seats according to the aggregate vote for their candidates in a constituency; each party's seats then go to its candidates with the highest individual totals.
Election lists are registered by election committees (temporary bodies that anyone seeking election can establish, usually parties or coalitions of parties). A list can contain at most twice as many candidates as there are seats in the constituency.
For example, Kwidzyn is in constituency no. 25, which has 12 seats, so every election committee can nominate up to 24 candidates.
There are thresholds for participating in the allocation of seats: 5% of the total votes cast for a party list and 8% for a coalition list. National minorities’ lists are exempt from the thresholds (they still have to gain a significant number of votes in a constituency to win any seats).
Vacancies arising between general elections are filled by the individual who is “next-in-line” on the list of the party which formerly held the seat.
Voting is not compulsory.
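The seat allocation for the lower house described above belongs to the family of highest-averages methods, which differ only in their divisor sequence. The sketch below is a minimal illustration, not PKW's implementation: the function name `allocate_seats`, the three lists A/B/C and their vote totals are made up, and thresholds are ignored.

```r
# Highest-averages seat allocation (illustrative sketch).
# votes:    named vector of list totals in one constituency
# seats:    number of seats to distribute
# divisors: divisor sequence (1, 2, 3, ... gives D'Hondt)
allocate_seats <- function(votes, seats, divisors = seq_len(seats)) {
  # Quotient table: one row per list, one column per divisor
  q <- outer(votes, divisors[seq_len(seats)], "/")
  # Positions of the `seats` largest quotients (column-major indices)
  top <- order(q, decreasing = TRUE)[seq_len(seats)]
  # Map each winning cell back to its row, i.e. to its list
  winners <- (top - 1) %% length(votes) + 1
  setNames(tabulate(winners, nbins = length(votes)), names(votes))
}

# Hypothetical totals for three lists competing for 12 seats
votes  <- c(A = 120000, B = 75000, C = 30000)
dhondt <- allocate_seats(votes, 12)                         # divisors 1, 2, 3, ...
msl    <- allocate_seats(votes, 12, c(1.4, seq(3, 23, 2)))  # modified Sainte-Laguë
print(rbind(dhondt, msl))
```

With these made-up totals the D'Hondt divisors give 7, 4 and 1 seats, while the modified Sainte-Laguë divisors give 6, 4 and 2, which shows how the divisor sequence affects smaller lists.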
100 single-member constituencies (about 1 senator per 380 thousand people).
Majority: Single member plurality system (“first past the post”).
Vacancies arising between general elections are filled through by-elections (except in the last six months of the legislature’s term).
Voting is not compulsory.
Poland has a three-tier administrative system consisting of municipalities (gmina), counties (powiat) and provinces (voivodeships, województwo). There are 2478 gminas, 380 powiats and 16 voivodeships.
Municipalities are run by a directly elected mayor. Municipality size varies enormously: the biggest gmina in Poland is Warsaw (1.5 mln people, 5 mln EUR budget/year).
There are councils with very limited power at the gmina and powiat levels. At the powiat level the council elects the powiat head (starosta).
There are councils with significant power at the voivodeship level. The council elects a kind of regional government and administers a considerable amount of money.
Local elections: 2478 gmina mayors plus councillors at all three levels mentioned above (4 elections altogether).
Elections to voivodeship councils are broadly similar to elections to the lower house of Parliament. A list can contain at most the number of seats plus 2 candidates (in every constituency). There are 85 multi-member constituencies (5–8 seats each), and circa 560 seats are filled in total.
There are thresholds for participating in the allocation of seats: 5% of the total votes cast for a party list (probably 8% for a coalition list, but this is not certain).
The threshold is computed for each voivodeship separately, not for the whole country.
As 90% or more of the seats are won by the major political parties, elections to voivodeship councils (VC) can be regarded as another contest between Poland's main political parties.
There are 16 provinces, 380 counties and 2750 communities in Poland (as you know already).
As Poland's population is 38.5 mln and its area is 312.7 thousand sq km (about 120 persons per sq km), on average each powiat covers 820 sq km and each community 113.5 sq km, that is approximately 100 thousand persons per powiat and 14 thousand per gmina.
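The averages quoted above can be checked with a few lines of R. This is only a back-of-the-envelope sketch: it assumes the area is 312.7 thousand sq km and uses the 2750 communities quoted in the text; the variable names are illustrative.

```r
population <- 38.5e6    # residents, as quoted above
area_km2   <- 312.7e3   # assumed: 312.7 thousand sq km
n_powiat   <- 380
n_gmina    <- 2750      # community count used in the text

density         <- population / area_km2   # persons per sq km
area_per_powiat <- area_km2 / n_powiat     # sq km per powiat
area_per_gmina  <- area_km2 / n_gmina      # sq km per community
pop_per_powiat  <- population / n_powiat   # persons per powiat
pop_per_gmina   <- population / n_gmina    # persons per community

round(c(density = density,
        area_per_powiat = area_per_powiat,
        area_per_gmina  = area_per_gmina,
        pop_per_powiat  = pop_per_powiat,
        pop_per_gmina   = pop_per_gmina))
```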
The Polish word for province is “prowincja” (both words come from Latin), but a Polish administrative province is actually called “województwo”, from “wodzić”, i.e. to command (armed troops, in this context). This is an old term dating from the 14th century, when Poland was divided into provinces, each ruled by a “wojewoda”, i.e. the chief of that province. More can be found on Wikipedia (cf. Administrative divisions of Poland).
A Polish county (“powiat”) is a second-level administrative unit. In old Poland a powiat was called “starostwo” and its head was called “starosta”. “Stary” means old, so a “starosta” is an old (and thus wise) person. BTW the head of a powiat is still called “starosta”, as 600 years ago :-)
TERYT is the official register of the territorial division of Poland. It is a complex system which includes the identification of administrative units. Every unit has an id number of up to 7 digits: wwppggt, where ww = “województwo” id, pp = “powiat” id, gg = “gmina” id and t encodes the type of community (rural, municipal or mixed). Higher-level units have trailing zeros in the irrelevant part of the id, so 14 and 1400000 mean the same, as do 1205 and 1205000. Six digits are enough to identify a community (approx. 2750 units).
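The id layout described above can be unpacked with plain string operations. The helper below is a hypothetical sketch (the name `parse_teryt` and the output column names are not part of any official tool); the last id in the usage line is only an example of a full 7-digit code. Ids should be kept as character strings so that leading zeros survive.

```r
# Split TERYT ids into their wwppggt components (illustrative helper).
parse_teryt <- function(id) {
  id <- as.character(id)
  # Pad on the right with zeros, so "14" is treated like "1400000"
  id <- paste0(id, strrep("0", 7 - nchar(id)))
  data.frame(
    woj = substr(id, 1, 2),   # ww: województwo
    pow = substr(id, 3, 4),   # pp: powiat
    gmi = substr(id, 5, 6),   # gg: gmina
    typ = substr(id, 7, 7),   # t:  type of community
    stringsAsFactors = FALSE
  )
}

teryt <- parse_teryt(c("14", "1205", "1205000", "2207011"))
print(teryt)
```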
The election system is two-tier: 41/85 constituencies plus 27,000 polling stations. Constituencies are designed to preserve powiat boundaries (i.e. a constituency usually consists of several powiats).
Thus in practice proportionality of election is approximate…
The primary source of data is the Central Election Commission web page, which holds all the relevant registers: constituency and polling station details (addresses, boundaries, number of inhabitants and voters), candidate and committee details, turnout, number of invalid votes, etc. (https://pkw.gov.pl/ and/or https://wybory.gov.pl/sejmsenat2019/)
https://nsd.no/european_election_database/country/poland/parliamentary_elections.html
The file obwody2011pkw2019gus.csv contains data on the 41 constituencies, including the number of seats (lmandatow), the number of voters (wyborcy2019) and the population (mieszkancy2019 and ludnosc2018gus):
## total2019w total2019m.gus total2019m.pkw
## [1,] 30201280 38615580 36913667
The variable mieszkancy2019 contains the population as registered in the PESEL system (cf https://en.wikipedia.org/wiki/PESEL), while ludnosc2018.gus contains the population as estimated by the Polish Central Statistical Office (GUS). Note the difference (1701913, or about 4.61%). PKW uses the PESEL variant of population statistics.
Number of seats as assigned by PKW (based on PESEL/2011 register):
Number of residents per seat (GUS vs PKW, thousand residents): 83.946913 vs 80.2471022.
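The difference between the two population totals and the residents-per-seat figures follow directly from the totals printed above. A minimal sketch of the arithmetic, with illustrative variable names:

```r
seats     <- 460
total_gus <- 38615580   # total2019m.gus from the output above
total_pkw <- 36913667   # total2019m.pkw from the output above

diff_abs <- total_gus - total_pkw          # absolute difference
diff_pct <- 100 * diff_abs / total_pkw     # relative to the PESEL total
per_seat <- c(gus = total_gus, pkw = total_pkw) / seats / 1000  # thousand residents

print(diff_abs)
print(round(diff_pct, 2))
print(round(per_seat, 2))
```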
Finally, let us plot the differences between the number of seats assigned by PKW on the basis of 2011 data and the true number of seats computed from PKW 2019 data (voters) or GUS 2018 data (residents), i.e. assignedSeats - trueSeats. A negative difference represents a shortage of seats, a positive one a surplus.
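One simple way to obtain such differences is to treat the "true" number of seats as a constituency's population share times the total number of seats. The sketch below assumes exactly that (the rule actually used for the chart is not stated here) and runs on a made-up three-constituency data frame; only the column name `lmandatow` comes from the data file, `ludnosc` is a placeholder.

```r
# Hypothetical data: three constituencies sharing 30 seats
okr <- data.frame(
  okr       = 1:3,
  lmandatow = c(12, 8, 10),               # seats as assigned
  ludnosc   = c(1000000, 500000, 900000)  # residents (made-up numbers)
)

total_seats <- sum(okr$lmandatow)
# Assumed definition: proportional share of seats, left unrounded
okr$trueSeats <- okr$ludnosc / sum(okr$ludnosc) * total_seats
# Negative = shortage of seats, positive = surplus
okr$diff <- okr$lmandatow - okr$trueSeats
print(okr)
```

By construction the differences sum to zero, so a surplus in one constituency is always matched by a shortage elsewhere.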
Turnout is computed as (valid_votes + invalid_votes) / registered_voters.
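Written as code, the turnout formula needs parentheses around the sum of votes, because division binds tighter than addition. A small sketch with a hypothetical helper `turnout()` and made-up numbers; stations without registered voters are returned as NA, which is an assumption about how such cases should be handled.

```r
# Turnout in percent; all three arguments may be vectors of equal length
turnout <- function(valid, invalid, registered) {
  ifelse(registered > 0,
         100 * (valid + invalid) / registered,
         NA_real_)   # assumed convention for stations with no voters
}

# Made-up example: 612 valid and 8 invalid votes, 1000 registered voters
t1 <- turnout(valid = 612, invalid = 8, registered = 1000)
t0 <- turnout(valid = 0, invalid = 0, registered = 0)
print(c(t1, t0))
```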
First compute basic descriptive statistics (fivenum returns min/q1/median/q3/max)
BTW lgw is the number of valid votes (liczba głosów ważnych), freq is the turnout (%) and pgnw is the share of invalid votes (%):
fivenum(d$lgw)
## [1] 0 340 612 944 5239
mean(d$lgw, na.rm=TRUE)
## [1] 673.8887
fivenum(d$freq)
## [1] 0.00 52.38 59.28 66.05 100.00
mean(d$freq, na.rm=TRUE)
## [1] 59.24028
We can use histogram and frequency polygon (freqpoly) to plot relevant distributions:
Number of stations with very few valid votes (lgw < 25):
## keep the full data in d; store the small stations separately
d.small <- subset(d, lgw < 25)
nrow(d.small)
## [1] 520
The file wp2019_komisje_wyniki.csv contains election results at polling station level.
There are four mainstream political parties in Poland: PiS (Law and Justice, cf https://en.wikipedia.org/wiki/Law_and_Justice), PO (Civic Platform, cf https://en.wikipedia.org/wiki/Civic_Platform), PSL (agrarian People’s Party, cf https://en.wikipedia.org/wiki/Polish_People%27s_Party) and SLD (Democratic Left Alliance, cf https://en.wikipedia.org/wiki/Democratic_Left_Alliance).
First compute basic descriptive statistics (fivenum returns min/q1/median/q3/max)
## Load the data; compute basic statistics:
komisje <- read.csv("wp2019_komisje_wyniki.csv",
colClasses = c("teryt" = "character"), sep = ';', header = TRUE, na.strings = "NA");
fivenum(komisje$PSLp);
## [1] 0.00 5.53 8.20 12.42 92.59
fivenum(komisje$PiSp);
## [1] 0.00 35.67 46.39 60.64 100.00
fivenum(komisje$POp);
## [1] 0.00 11.84 23.08 32.92 84.62
Distribution of votes at polling station level:
Or with frequency polygon we can plot all four at once (comparison):
Compare support for PiS/PO/PSL/SLD by province (województwo):
A well-known phenomenon of proportional list voting is the concentration of votes at the top (or sometimes the bottom) of the list. Elections in Poland are no exception.
The file kandydaci_sejm_2019_f4.csv contains total votes for every candidate from the 4 mainstream Polish parties (PiS/PO/PSL/SLD, the 4 riders of the apocalypse, 4RA for short). The data is organized as follows:
1;1;PSL;ŻUK Stanisław Adam;7694;1.78;24.81;T;M
1;2;PSL;SAMBORSKI Tadeusz;5703;1.32;18.39;N;M
...
The columns contain: constituency number, candidate number (on the election ballot), committee, candidate name, number of votes, % of valid votes in the constituency, % of valid votes cast for the candidate’s list, outcome (T = elected; N = not elected), sex (M = male, K = female).
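The record layout above can be tried out on the two sample lines without the full file. The sketch below reads them from a string; the column names are the ones that appear in the printed data frames later in this document, while `header = FALSE` applies only to this pasted sample (whether the real file has a header row is not shown here).

```r
# Two sample records copied from the listing above
sample_lines <- "1;1;PSL;ŻUK Stanisław Adam;7694;1.78;24.81;T;M
1;2;PSL;SAMBORSKI Tadeusz;5703;1.32;18.39;N;M"

kand.sample <- read.csv(
  text = sample_lines, sep = ";", header = FALSE,
  col.names = c("okr", "nrk", "komitet", "kandydat", "glosy",
                "procent", "lprocent", "wynik", "plec"),
  # Force T/N to stay text; a column holding only "T" would become logical
  colClasses = c(wynik = "character", plec = "character"),
  stringsAsFactors = FALSE
)
str(kand.sample)
```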
We start by plotting a boxplot.
BTW there are 3662 candidates from 4RA (out of 5100 in total). The chart is hard to read because of a few extreme values:
fivenum(kk$glosy)
## [1] 69 521 1144 3425 416030
We can remove the extreme values (glosy >= 125000) and replot the chart. There are exactly 6 such cases:
kk <- filter (kandydaci, glosy < 125000)
nrow(kk)
## [1] 3656
Replot the chart once more this time only for PiS and PO:
Final chart (4RA, but only for Pomorskie province, i.e. constituencies 25 and 26):
Concentration can be assessed not only visually but also estimated more precisely with a coefficient. The default measure of concentration is the Gini index. We will use another well-known measure, the Herfindahl-Hirschman Index (HH-Index), as it is much easier to compute (cf https://en.wikipedia.org/wiki/Herfindahl%E2%80%93Hirschman_Index).
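The HHI formula is \(HH = \sum_{i=1}^N x_i^2\), where \(x_i\) is the percentage share of the \(i\)-th of \(N\) items (here a candidate’s share of the votes cast for the list) and \(\sum_{i=1}^N x_i = 100\%\). With shares in percent, HH can reach at most 10000 (all votes going to one candidate).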
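The formula translates into a couple of lines of R. This is a minimal sketch, assuming shares are expressed in percent of the list's votes; the function name `hh_index`, the toy data frame and the committee labels X/Y are made up, while the column names okr/komitet/glosy follow the candidate data used in this document.

```r
# Herfindahl-Hirschman index from raw vote counts
hh_index <- function(votes) {
  shares <- 100 * votes / sum(votes)   # percentage shares, summing to 100
  sum(shares^2)
}

hh_equal  <- hh_index(rep(1, 10))      # ten equally popular candidates
hh_single <- hh_index(c(1, 0, 0))      # all votes for one candidate

# Per-list index on a toy data frame (made-up numbers)
kand.toy <- data.frame(
  okr     = c(1, 1, 1, 1),
  komitet = c("X", "X", "Y", "Y"),
  glosy   = c(900, 100, 500, 500)
)
hh.toy <- aggregate(glosy ~ okr + komitet, data = kand.toy, FUN = hh_index)
print(hh.toy)   # the glosy column now holds the HH value of each list
```

The more the votes pile up on one candidate, the closer the index gets to its maximum; an even spread over many candidates pushes it towards zero.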
The file HHsejm_2005_2019.csv contains various concentration measures, including the HH-Index, computed for every party/constituency/election (where constituency = 1..41 and election = {2005, 2007, 2011, 2015, 2019}). We can plot the HH values, averaged over the five elections, for each constituency:
library(ggplot2)  # ggplot() and friends used below
library(dplyr)    # %>%, group_by(), summarise_at(), filter()
library(data.table)
options(dplyr.print_max = 1e9)
okr.scale <- seq(0, 42, by=2);
diff.scale <- seq(0, 30, by=2);
hh <- read.csv("HHsejm_2005_2019.csv", sep = ';', header=T, na.string="NA");
str(hh)
## 'data.frame': 205 obs. of 35 variables:
## $ rok : int 2005 2005 2005 2005 2005 2005 2005 2005 2005 2005 ...
## $ okr : int 1 2 3 4 5 6 7 8 9 10 ...
## $ hhPO : num 1062 1796 3659 1400 1317 ...
## $ maxPO : num 21.7 37 58.9 28.2 26.8 38.9 23.3 24.3 32.4 47.4 ...
## $ topPO : num 21.7 37 58.9 28.2 26.8 38.9 23.3 24.3 32.4 47.4 ...
## $ max3PO : num 44.9 58.6 74.6 54.5 53.8 60 43.4 50.5 64.2 65.2 ...
## $ top3PO : num 44.6 54.3 73.5 47.4 53.8 58.6 42.1 43.5 64.2 65.2 ...
## $ max6PO : num 70.1 77 84.3 78.5 72.8 77.4 61.9 69.7 79.9 81.9 ...
## $ top6PO : num 60.8 74.3 84.3 71 67.3 70.6 58.4 58.4 76 77.6 ...
## $ cum90PO : int 11 10 10 11 12 13 15 15 11 10 ...
## $ hhPSL : num 2272 2936 1070 1938 1258 ...
## $ maxPSL : num 43.8 51.4 25.9 39.9 29.5 17.8 15.7 44.7 18.9 34.3 ...
## $ topPSL : num 43.8 51.4 25.9 39.9 29.5 17.8 12.9 44.7 18.9 34.3 ...
## $ max3PSL : num 66.8 69.4 44.2 60.1 49 37.4 38.8 56.6 51.9 57.1 ...
## $ top3PSL : num 66.8 63.1 41.9 59.7 37.1 36.1 29 56.6 47.5 57.1 ...
## $ max6PSL : num 79 85.4 64.2 76.3 67.2 59.7 63.2 71.1 72.3 74.3 ...
## $ top6PSL : num 76.7 78.2 60.5 74 51.3 49.6 55.2 66.4 57.3 69.5 ...
## $ cum90PSL: int 12 8 15 13 15 17 14 13 12 12 ...
## $ hhPiS : num 1105 1237 2498 1254 1162 ...
## $ maxPiS : num 25.5 22.8 46.9 24.3 25.1 27.9 24.6 31.3 27.5 35.7 ...
## $ topPiS : num 25.5 22.8 46.9 24.3 25.1 27.9 24.6 31.3 27.5 35.7 ...
## $ max3PiS : num 47.6 50.4 66.8 56.4 49.7 45.7 45.9 53.4 58 58.6 ...
## $ top3PiS : num 42.3 50.4 64.4 56.4 47.8 42.4 45.9 47.1 49.6 58.6 ...
## $ max6PiS : num 66.8 69.2 80.6 72.8 71.9 61 65.5 71.8 73.6 75.6 ...
## $ top6PiS : num 65 64.6 75 72.8 71.9 58.4 60.4 66.6 71.6 66.4 ...
## $ cum90PiS: int 15 11 12 15 15 17 15 13 12 11 ...
## $ hhSLD : num 2954 2339 2018 4165 1552 ...
## $ maxSLD : num 48 36.5 39.1 63.3 32 30.2 26.6 17.7 30 28.2 ...
## $ topSLD : num 48 23.2 39.1 63.3 17.3 26.4 26.6 16.9 30 28.2 ...
## $ max3SLD : num 79.1 77.5 65.5 78.6 58.7 64.1 48.7 48.7 60 53.9 ...
## $ top3SLD : num 72.7 66.7 65.5 73.1 56.3 64.1 48.7 35.2 56.9 46.9 ...
## $ max6SLD : num 91 96.1 81.7 87.7 75.1 76.4 73.7 73.2 82.3 78.8 ...
## $ top6SLD : num 91 95 79.8 81.9 66.5 73.3 71.9 62.4 74.8 73 ...
## $ cum90SLD: int 6 5 11 7 14 15 12 14 9 10 ...
## $ X : logi NA NA NA NA NA NA ...
hh.okr <- hh %>% group_by(okr) %>% summarise_at(vars("hhPO", "hhPiS", "hhSLD", "hhPSL"), mean) %>% as.data.frame
str(hh.okr)
## 'data.frame': 41 obs. of 5 variables:
## $ okr : int 1 2 3 4 5 6 7 8 9 10 ...
## $ hhPO : num 1560 2158 3229 2467 1303 ...
## $ hhPiS: num 1206 1945 1917 1619 1420 ...
## $ hhSLD: num 3080 2154 1984 2686 2572 ...
## $ hhPSL: num 1453 2143 1272 1939 1296 ...
ggplot(hh.okr, aes(x = okr)) +
ggtitle("Average HH 2005--2019",
subtitle="3:Wroc 9:Łódź, 19/20:Wwa 12/13:Krk 15:Tarnów 23: Rzeszów") +
###
geom_point(aes(y = hhPO, colour = 'PO'), size=2, alpha=.5) +
geom_point(aes(y = hhPiS, colour = 'PiS'), size=2, alpha=.5) +
geom_point(aes(y = hhSLD, colour = 'SLD'), size=2, alpha=.5) +
geom_point(aes(y = hhPSL, colour = 'PSL'), size=2, alpha=.5) +
## labels
ylab(label="HH") +
xlab(label="constituency") +
labs(colour = "Party: ") +
scale_x_continuous(breaks=seq(0,42, by=2)) +
scale_y_continuous(breaks=seq(0,8000,by=1000), limits = c(0,8000))+
theme(legend.position="top") +
theme(legend.text=element_text(size=12));
The same as above but for 2019 only:
Extreme concentration can cause some odd results. For example, MPs can be elected with extremely low support (a few thousand votes):
kk <- filter (kandydaci, glosy < 6666 & wynik == "T")
print (kk)
## okr nrk komitet kandydat glosy procent lprocent wynik
## 1 1 2 SLD OBAZ Robert Marcin 4205 0.97 5.92 T
## 2 2 3 PiS MURDZEK Wojciech Piotr 6354 2.25 5.54 T
## 3 18 8 PiS GÓRSKI Maciej 5206 1.15 1.92 T
## 4 19 4 PO FABISIAK Joanna 5347 0.39 0.92 T
## 5 19 13 PO JACHIRA Klaudia Krystyna 6434 0.47 1.11 T
## 6 21 4 PO WILCZYŃSKI Ryszard Jan 6436 1.58 5.93 T
## 7 25 7 PO BOROWCZAK Jerzy Stanisław 5491 1.04 2.51 T
## 8 31 4 PiS SOŚNIERZ Andrzej Stanisław 6061 1.29 3.29 T
## 9 31 7 PiS WESOŁY Marek Jacek 6091 1.30 3.31 T
## 10 33 5 PiS KWITEK Marek 5453 0.96 1.73 T
## 11 41 4 PO ŁĄCKI Artur Jarosław 5668 1.20 3.37 T
## plec
## 1 M
## 2 M
## 3 M
## 4 K
## 5 K
## 6 M
## 7 M
## 8 M
## 9 M
## 10 M
## 11 M
wgpp – voters voting by proxy (disabled persons); wgnpzpg – voters voting outside district of permanent residence
kk <- read.csv("komisje-wyniki-frekwencja-2019.csv", sep = ';', header=T, na.string="NA");
sum(kk$wgnpzpg)
## [1] 202792
pz.pis <- ggplot(kk, aes(x = wgnpzpg, y=PiSp )) +
geom_point(colour = 'blue', alpha=.4) +
ggtitle("PiS vs wgnpzpg") +
theme(plot.title = element_text(hjust = 0.5)) +
xlab(label="wgnpzpg") +
ylab(label="PiSp") +
scale_y_continuous(limits = c(0, 100)) +
geom_smooth(method = "lm", colour = 'black')
pz.po <- ggplot(kk, aes(x = wgnpzpg, y=POp )) +
geom_point(colour = '#E69F00', alpha=.4) +
ggtitle("PO vs wgnpzpg") +
theme(plot.title = element_text(hjust = 0.5)) +
xlab(label="wgnpzpg") +
ylab(label="PO") +
scale_y_continuous(limits = c(0, 100)) +
geom_smooth(method = "lm", colour = 'black')
pz.sld <- ggplot(kk, aes(x = wgnpzpg, y=SLDp )) +
geom_point(colour = '#56B4E9', alpha=.4) +
ggtitle("SLD vs wgnpzpg") +
theme(plot.title = element_text(hjust = 0.5)) +
xlab(label="wgnpzpg") +
ylab(label="SLD") +
scale_y_continuous(limits = c(0, 100)) +
geom_smooth(method = "lm", colour = 'black')
pz.psl <- ggplot(kk, aes(x = wgnpzpg, y=PSLp )) +
geom_point(colour = '#636363', alpha=.4) +
ggtitle("PSL vs wgnpzpg") +
theme(plot.title = element_text(hjust = 0.5)) +
xlab(label="wgnpzpg") +
ylab(label="PSL") +
scale_y_continuous(limits = c(0, 100)) +
geom_smooth(method = "lm", colour = 'black')
library(ggpubr)  # ggarrange() is assumed to come from the ggpubr package
ggarrange(pz.pis, pz.po, pz.sld, pz.psl, ncol = 2, nrow = 2)
PO/SLD have more determined voters?
Not necessarily. PO and SLD are more popular in cities, and wgnpzpg voters are an urban rather than a rural phenomenon…
bar.color <- "blue";
okr.scale <- seq(0, 42, by=2);
okr.scaleD <- seq(0, 42, by=1);
diff.scale <- seq(0, 20, by=2);
diffp.scale <- seq(0, 100, by=10);
kk <- read.csv("kk2015_2019.csv", sep = ';', header=T, na.string="NA");
kk <- kk %>% filter(komitet=="PiS" | komitet=="PO" | komitet=="SLD" | komitet=="PSL" | komitet == "KONF") %>% as.data.frame
kk$diffm <- kk$m2019 - kk$m2015
p0 <- ggplot(data=kk, aes(x=as.factor(komitet), y=diffm, fill=komitet )) +
geom_bar(stat="identity", position=position_dodge(width=.4), width=.8, alpha=.5) +
geom_text(data=kk, aes(label=sprintf("%s", diffm), y= diffm), vjust=-1.25, size=3 ) +
theme(legend.position="top") +
ylab(label="m2019 - m2015") +
xlab(label="") +
labs(caption = "Source: https://github.com/hrpunio/Data/tree/master/sejm2019") +
ggtitle("Seats (2019 vs 2015)")
p1 <- ggplot(data=kk, aes(x=as.factor(komitet), y=m2019, fill=komitet )) +
geom_bar(stat="identity", position=position_dodge(width=.4), width=.8, alpha=.5) +
geom_text(data=kk, aes(label=sprintf("%s", m2019), y= m2019), vjust=-1.25, size=3 ) +
#scale_x_continuous(breaks=seq(1,41,by=1), limits = c(1,41)) +
#scale_y_continuous(breaks=seq(-8,18,by=2), limits = c(-3,15)) +
theme(legend.position="top") +
ylab(label="m2019") +
xlab(label="") +
labs(caption = "Source: https://github.com/hrpunio/Data/tree/master/sejm2019") +
ggtitle("Seats 2019")
p1
p0
Election committees are in fact coalitions (formal, or informal to lower the entry threshold).
If we count elected party members (not election committees), the figures are as follows (4RA):
We have described election results in Poland with a bar plot, box plot, dot plot, scatter plot, histogram and frequency polygon. We did not use a pie chart, as it is not recommended :-)